Abstract

This paper describes a data base management system under development
at Kansas State University, intended for use in a network composed primarily
of minicomputers. The report presents a description of the computers forming
the network and their inter-computer communication system. The data base
management system is a network type as specified by CODASYL. An extension
of a CODASYL-type DBMS to multicomputer configurations is presented and
several DBMS network topologies are discussed. We then conclude with a
discussion of a completely distributed data base network.

I. Introduction

Many organizations need to expand their data processing power and
simultaneously increase their data accessibility. This paper presents
the results of research aimed at developing systems to satisfy that need. A
distributed data base management system residing on a network composed
mainly of minicomputers is under development by Kansas State University and
the U.S. Army Computer Systems Command. A distributed data base system allows
a user program to be entered on any machine in the system, be executed by
another (or the same) processor, and access data on all secondary storage
devices attached to the network (provided security requirements are met).

A distributed data base system of the form presented in this report
provides an economical and easily expandable data processing facility with
considerable flexibility and computational power.

II.  Data  Base  Network 

A.  Hardware  Configuration 

The network which supports the distributed data base system is composed
of six computers manufactured by four different vendors. This heterogeneous
blend of architectures requires that the data base and communication software
be portable. The software systems presented in this paper are designed to
be sufficiently general to allow a wide variety of different computers to
be incorporated into the network.

The present network configuration is depicted in Figure 1. The three
INTERDATA machines and the NOVA reside in the Computer Science Department of
Kansas State University. These four local machines are all directly connected
via high speed links. The IBM 370/158 is located at the KSU Computing Center
and is accessed via a slow speed phone link. The PDP 11/70 is situated at
Ft. Belvoir, VA (U.S. Army Computer Systems Command). The PDP 11/70 and
INTERDATA 8/32 communicate over conventional phone lines.

Figure 1. Network Configuration

B. Inter-Machine Communication

Although the network consists of a heterogeneous blend of processors,
the means of communicating among the nodes is identical at the highest level
in all cases. The Inter-Computer Communication System (ICCS) acts as a
message system to route programs, data, and control information among machines
and tasks. A functional overview of ICCS is given in this report. A detailed
description can be found in reference [1]. ICCS performs the following
functions:

1. Synchronize task and processor,

2. Perform inter-process communication which enables exchange of data
and control information between distributed tasks,

3. Manage message buffers,

4. Standardize the interface between application programs and service
programs (DBMS tasks), and

5. Provide a uniform communication facility abstracted above but
implemented on standard connections.

Inter-task communication is performed by SEND and RECEIVE procedures.
The SEND procedure identifies the target task, by name and by port within
the task, through a TO-ID parameter. The RECEIVE procedure may either receive
a message from a task identified by a FROM-ID parameter or accept a message
using a first-come-first-served discipline. The RECEIVE procedure can also
be told whether the task issuing the receive is to wait for the message or is
to proceed unconditionally.
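
The SEND and RECEIVE behavior described above can be sketched in a few lines
of Python. This is only an illustrative in-memory model, not the ICCS
interface itself: the class name MessageSystem, the use of (task, port)
tuples as identifiers, and the exception raised for the wait option are all
assumptions made for the example.

```python
from collections import deque


class MessageSystem:
    """Toy in-memory stand-in for ICCS (hypothetical, for illustration)."""

    def __init__(self):
        # One queue of (sender, payload) pairs per destination identifier.
        self.queues = {}

    def send(self, from_id, to_id, payload):
        """SEND: deliver payload to the task and port named by to_id."""
        self.queues.setdefault(to_id, deque()).append((from_id, payload))

    def receive(self, own_id, from_id=None, wait=False):
        """RECEIVE: take the oldest message, optionally only from from_id.

        Returns (sender, payload), or None when nothing matches and the
        caller chose the proceed option (wait=False).
        """
        queue = self.queues.get(own_id, deque())
        for index, (sender, payload) in enumerate(queue):
            if from_id is None or sender == from_id:
                del queue[index]
                return sender, payload
        if wait:
            # A real system would suspend the task until a message arrives;
            # this sketch only signals that the task would block.
            raise RuntimeError("task would block: no matching message")
        return None


iccs = MessageSystem()
iccs.send(("app", 0), ("dbms", 1), "FIND")
iccs.send(("report", 0), ("dbms", 1), "GET")
first = iccs.receive(("dbms", 1))                            # first come, first served
selected = iccs.receive(("dbms", 1), from_id=("report", 0))  # by FROM-ID
nothing = iccs.receive(("dbms", 1))                          # proceed option
```

Omitting from_id gives the first-come-first-served discipline; passing it
restricts the receive to one sender.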

The interaction of ICCS, communicating tasks on different machines, and
the operating systems of the machines is depicted in Figure 2. The task
originating the communication invokes ICCS with a sequence of fixed length
block messages which compose the data or control information to be transmitted.
The invocation is carried out by means of a subroutine call. The version of
ICCS residing on the transmitting processor either transmits the series of
buffers to the receiving machine via the communication links, or queues the
messages if it has insufficient buffer space available. The message is then
transmitted to the receiving processor, which is specified by the message
identifier. The transmission may be accomplished via direct links or by routing
through other processors. The message system on the receiving processor reads
the message into any available buffers. If no buffers are available the
message is queued. The contents of the buffer are then made available to the
task to which the message is directed.

Figure 2. Inter-Task Communication via ICCS
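
The fixed length blocks and the buffer-or-queue rule described above can be
illustrated with a small Python sketch. The block length of 8 bytes, the zero
padding of the last block, and the names to_blocks and Node are assumptions
for the example; the paper does not state the actual block size or buffer
policy.

```python
from collections import deque

BLOCK_SIZE = 8  # assumed block length; the paper does not give one


def to_blocks(data, size=BLOCK_SIZE):
    """Split data into fixed length blocks, zero-padding the last block."""
    blocks = [data[i:i + size] for i in range(0, len(data), size)]
    if blocks and len(blocks[-1]) < size:
        blocks[-1] = blocks[-1].ljust(size, b"\x00")
    return blocks


class Node:
    """Hypothetical message system on one processor with limited buffers."""

    def __init__(self, free_buffers):
        self.free_buffers = free_buffers
        self.buffers = []        # blocks currently held in buffers
        self.pending = deque()   # blocks queued for lack of buffer space

    def accept(self, block):
        """Place a block in a free buffer, or queue it if none is free."""
        if self.free_buffers > 0:
            self.free_buffers -= 1
            self.buffers.append(block)
        else:
            self.pending.append(block)

    def release(self):
        """Hand the oldest buffered block to the task and reuse its buffer."""
        block = self.buffers.pop(0)
        if self.pending:
            self.buffers.append(self.pending.popleft())
        else:
            self.free_buffers += 1
        return block


blocks = to_blocks(b"STORE EMPLOYEE 42")
node = Node(free_buffers=2)
for block in blocks:
    node.accept(block)
delivered = node.release()
```

With two buffers and three blocks, the third block waits in the queue until
the first has been handed to the receiving task.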

When a task is to receive a message it issues a call to ICCS. If the
message is not contained in the ICCS buffers in that processor, the task can
specify either the wait or proceed option. If the wait option is indicated,
the task will suspend execution until the ICCS on its processor receives the
message. At this time the receiving task will be allowed to resume.

ICCS may exist in one of two forms depending upon the level of system
software that can be utilized. The Multi-Computer Communication System
(MCCS) is a version of the message control system intended to execute under
single task operating systems. The Inter-Task Communication System (ITCS) is
a multi-tasking version which executes on a system providing efficient skeletal
inter-task communication facilities. MCCS has been implemented on the IBM
370 as a CMS machine under VM/370. Implementation details can be found in
[2]. ITCS is in execution on a NOVA 2/10 under RDOS. Specifications of ITCS
are given in [3].

ICCS has been developed to provide a generalized communication mechanism
for intertask communication between distributed tasks. It can be adapted easily
to serve as the communication facility for a data base management system. In
a DBMS environment, data base function requests can be transmitted in one
direction and data and status sent in the other direction.

C.  DBMS  Specifications 

The data base management system that is to be distributed functionally
among the network nodes is based upon the CODASYL data base specifications.
The DBMS encompasses virtually all of the features listed by the CODASYL
committee as shown in Reference [4]. Complete language specifications for
this system are available in Reference [5].

The internal operation of a CODASYL DBMS is illustrated here by describing
the actions that occur when a DML statement is executed. More information
on the workings of a CODASYL DBMS can be found in [6-8]. Figure 3 shows the
memory layout of the DBMS and the action sequence.

1. A DML command is encountered in the application program.
A call to the DBMS is then issued.

2. The DBMS analyzes the call and verifies the request against the
object versions of the schema and sub-schema.

3. The contents of System Buffers are checked.

4. If necessary, the DBMS requests that the operating system perform
a physical I/O transfer.

5. The operating system controls the I/O operation of step 6.

6. Data is transferred between secondary storage and system buffers
by the operating system.

7. The DBMS transfers data between system buffers and the User Working
Area (UWA) as required.

8. The DBMS provides status information on the recently completed operation.

9. The data in the UWA may be operated upon in any manner by the
application program.
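
The action sequence above can be traced in a short Python sketch. It models
only a retrieval, uses dictionaries in place of secondary storage, system
buffers, and the UWA, and checks the request against a set of record names
standing in for the sub-schema. The class ToyDBMS, its get method, and the
status strings are hypothetical and are not taken from the CODASYL
specifications.

```python
class ToyDBMS:
    """Hypothetical single-machine DBMS tracing steps 2 through 8."""

    def __init__(self, subschema, disk):
        self.subschema = subschema   # record names the program may use
        self.disk = disk             # stands in for secondary storage
        self.system_buffers = {}
        self.io_count = 0            # physical transfers performed so far

    def get(self, record_name, key, uwa):
        # Step 2: verify the request against the sub-schema.
        if record_name not in self.subschema:
            return "INVALID"
        page = (record_name, key)
        # Step 3: check the contents of the system buffers.
        if page not in self.system_buffers:
            # Steps 4-6: physical transfer from secondary storage.
            if page not in self.disk:
                return "NOT FOUND"
            self.system_buffers[page] = self.disk[page]
            self.io_count += 1
        # Step 7: move the data from system buffers to the UWA.
        uwa[record_name] = dict(self.system_buffers[page])
        # Step 8: report status to the application program.
        return "OK"


disk = {("EMPLOYEE", 7): {"name": "Jones"}}
dbms = ToyDBMS({"EMPLOYEE"}, disk)
uwa = {}
status = dbms.get("EMPLOYEE", 7, uwa)   # step 1: the call issued for a DML command
name = uwa["EMPLOYEE"]["name"]          # step 9: the program uses the UWA
```

A second request for the same record is satisfied from the system buffers
and causes no further physical transfer in this model.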
III. Distributed Data Base Systems

A.  Back-End  DBMS 

A CODASYL DBMS as originally conceived was targeted for a single computer.
However, a DBMS represents a significant drain on system resources due to the
large amounts of computational activity and the sizeable number of I/O
operations coupled with the operating system overhead introduced by the
requisite task switches. Because of the high-level nature of the DML, a single
DML statement can result in several secondary storage accesses, and hence many
task switches are inherent in a centralized DBMS. A back-end DBMS was
conceived by Canaday, et al. [9] to relieve the processor of a portion of its
DBMS workload and overhead by incorporating a minicomputer dedicated to
performing DBMS functions into the configuration.

The basic method of operation of a back-end DBMS is to restrict the
physical access to the data base to the back-end machine. The application
program is executed on the original (or host) computer. When a DML statement
is encountered in an application program, a message is transmitted to the
back-end computer via a communication system such as the ICCS described in
Section II.B. A task on the back-end computer then performs the DML function,
including any requisite I/O operations on the data base. In effect, the
back-end processor acts as a sophisticated data base I/O device for the host
machine.

The applicability of the back-end DBMS concept to production data
processing systems has been investigated in [10]. In that study it was shown
that the incorporation of a back-end machine into a DBMS system reduces task
switching overhead on the host CPU, decreases the primary memory requirements
for the DBMS and applications in the host, and provides for an overall increase
in the availability of system resources.
In order to realize a back-end DBMS, the data base software shown in
Figure 3 must be distributed between the host and back-end computers. The
back-end configuration is shown in Figure 4. Each DML statement results in an
exchange of information between the two machines. The DBMS software in the
host becomes minimal, consisting only of interface routines between the
application program and the message system.

The back-end DBMS unloads a substantial amount of the processing requirement
from the host to the back-end machine, which frees the host machine for
additional processing. If this processing is dedicated to additional DBMS
applications, the benefits of concurrent operation of the host and back-end
CPUs can be reaped.

Inter-machine communication is accomplished in basically the following
way. Whenever a DML statement results in a transmission to the back-end, an
interface routine formats a message which indicates the action that the
back-end must take in order to complete the DML function. The message is
transmitted by the ICCS to the back-end. The back-end computer also contains a
routine that serves as an intermediary between ICCS and the DBMS routines
residing on the back-end. This routine will decode the message and activate
the appropriate DBMS function, which may result in one or more I/O operations.
When the DML function has been completed, data and/or status information is
returned to the application program in the host machine. Figure 5 indicates
the flow of messages in the back-end DBMS.
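
The message exchange just described can be sketched as a pair of routines,
one on each machine. The sketch is an assumption-laden illustration: the JSON
message layout, the function host_interface, the class BackEnd, and the two
verbs STORE and GET are invented for the example, and the transfer that ICCS
would perform is replaced by a direct call.

```python
import json


def host_interface(verb, record, data=None):
    """Host side: format a DML request as a message for the back-end."""
    return json.dumps({"verb": verb, "record": record, "data": data})


class BackEnd:
    """Hypothetical intermediary routine plus DBMS functions on the back-end."""

    def __init__(self):
        self.store = {}  # stands in for the data base on secondary storage
        self.functions = {"STORE": self._store, "GET": self._get}

    def handle(self, message):
        """Decode the message, activate the DBMS function, return a reply."""
        request = json.loads(message)
        function = self.functions.get(request["verb"])
        if function is None:
            reply = {"status": "UNSUPPORTED", "data": None}
        else:
            reply = function(request["record"], request["data"])
        return json.dumps(reply)

    def _store(self, record, data):
        self.store[record] = data
        return {"status": "OK", "data": None}

    def _get(self, record, data):
        if record in self.store:
            return {"status": "OK", "data": self.store[record]}
        return {"status": "NOT FOUND", "data": None}


back_end = BackEnd()
# In the real system the message would travel through ICCS in both directions.
stored = json.loads(back_end.handle(host_interface("STORE", "EMP-7", {"name": "Jones"})))
fetched = json.loads(back_end.handle(host_interface("GET", "EMP-7")))
```

The host holds only the formatting routine, which mirrors the point that the
DBMS software remaining in the host is minimal.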

B.  Multiple  Machine  Configurations 

Thus far, the discussion has considered only a two-computer data base
network. This configuration can be extended in several ways. An
initial step in this direction is to incorporate several back-end machines
into the system. In this environment the host serves as the central element
of the network. Any data base access request originated by the host computer
must be targeted to the back-end machine connected to the appropriate data
base. As in all network DBMS topologies, intermachine communication occurs via
messages transmitted using the ICCS. Figure 6 illustrates a multiple back-end
configuration.

Figure 5. Information Flow in Back-End DBMS
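
The requirement that each request be targeted to the back-end holding the
appropriate data base amounts to a lookup performed on the host. A minimal
sketch follows, assuming the host keeps a table from data base name to
back-end name. The class HostRouter and the example names are hypothetical,
and the outbox list stands in for the hand-off to ICCS.

```python
class HostRouter:
    """Hypothetical host-side table mapping each data base to its back-end."""

    def __init__(self, database_to_backend):
        self.database_to_backend = dict(database_to_backend)
        self.outbox = []  # (back_end, request) pairs handed to the message system

    def route(self, database, request):
        """Pick the back-end connected to the data base and queue the request."""
        back_end = self.database_to_backend.get(database)
        if back_end is None:
            raise KeyError(f"no back-end holds data base {database!r}")
        self.outbox.append((back_end, request))
        return back_end


router = HostRouter({"PERSONNEL": "backend-1", "INVENTORY": "backend-2"})
target = router.route("INVENTORY", "GET PART-12")
```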

The basic back-end system can also be expanded to allow several host
machines to access a single back-end. In this environment, the back-end
machine controls the access to a centralized data base which may be referenced
by any of the host machines. Figure 7 depicts a multiple host configuration.

All prior discussions of the back-end machine have assumed that its
sole function is data base management. One of the basic objectives of a
distributed DBMS is efficient utilization of all system resources. It is
possible that a CPU totally dedicated to the DBMS function could have
considerable periods of inactivity. Assuming that the back-end computer has
multiprogramming capabilities, tasks other than DBMS functions could be
performed upon that machine. These additional tasks could include data base
application programs.

A machine capable of serving as both a host and back-end is known as a
bi-functional machine. In a network with back-end, host and bi-functional
machines (such as that in Figure 8) the only restriction as to the function of
a processor is its physical connection to secondary storage. A bi-functional
processor does not require any special software other than the DBMS and the
ICCS code. If an application program on a bi-functional machine has access
to the data base controlled by that machine, messages are transmitted to and
from the back-end DBMS code on the same machine via the ICCS of that machine.
This mode of operation naturally introduces overhead with respect to a single
processor DBMS. However, the benefits to be realized in terms of generality
of function and expanded data access make such a configuration highly desirable.

The goal of a distributed computing system is to balance the workload of
the component computing elements while maximizing access and throughput. In a
distributed DBMS network, each node is a bi-functional machine. Each node may
communicate with every node in the system (although this communication may be
realized by forwarding through intervening nodes). Figure 9 displays a typical
distributed DBMS. In such a configuration, an application program may be
submitted at one node, executed at another, and access the data bases of other
nodes in the network.
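
Forwarding through intervening nodes requires some way of choosing a path
between two nodes that are not directly linked. The paper does not say how
ICCS selects routes, so the following is only one possible approach: a
breadth-first search over an assumed link table that returns a path with the
fewest hops. The function find_route and the four-node topology are
hypothetical.

```python
from collections import deque


def find_route(links, source, target):
    """Return a fewest-hop node path from source to target, or None."""
    previous = {source: None}
    frontier = deque([source])
    while frontier:
        node = frontier.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in links.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                frontier.append(neighbor)
    return None


# Assumed topology: a chain of four nodes with no direct A-D link.
links = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
path = find_route(links, "A", "D")
```

Here a message from A to D would be forwarded through B and C.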

Problems such as data accessibility and integrity, transparency of data
location, contention and deadlock, communication and buffer management for a
distributed DBMS are considered in references [11-12].

IV.  Conclusion 

In summary, this paper provides a description of a flexible and powerful
network for data base management. The report describes the mini-computer
configuration upon which the system resides, the inter-machine communication
facilities, the operation of the data base management system, and the
various network configurations under which the DBMS can operate.

This  report  presents  a  distributed  data  base  network  as  an  economical, 
general,  and  practical  method  of  enhancing  (or  expanding)  a  data  processing 
facility. 